Biased Bagging for Unsupervised Domain Adaptation

نویسندگان

Twan van Laarhoven

Elena Marchiori

چکیده

Unsupervised Domain Adaptation (DA) is used to automatize the task of labeling data: an unlabeled dataset (target) is annotated using a labeled dataset (source) from a related domain. We cast domain adaptation as the problem of finding stable labels for target examples. A new definition of label stability is proposed, motivated by a generalization error bound for large margin linear classifiers: a target labeling is stable when, with high probability, a classifier trained on a random subsample of the target with that labeling yields the same labeling. We find stable labelings using a random walk on a directed graph with transition probabilities based on labeling stability. The majority vote of those labelings visited by the walk yields a stable label for each target example. The resulting domain adaptation algorithm is strikingly easy to implement and apply: It does not rely on data transformations, which are in general computational prohibitive in the presence of many input features, and does not need to access the source data, which is advantageouswhen data sharing is restricted. By acting on the original feature space, our method is able to take full advantage of deep features from external pre-trained neural networks, as demonstrated by the results of our experiments.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Deep Unsupervised Domain Adaptation for Image Classification via Low Rank Representation Learning

Domain adaptation is a powerful technique given a wide amount of labeled data from similar attributes in different domains. In real-world applications, there is a huge number of data but almost more of them are unlabeled. It is effective in image classification where it is expensive and time-consuming to obtain adequate label data. We propose a novel method named DALRRL, which consists of deep ...

متن کامل

Bagging-based System Combination for Domain Adaptation

Domain adaptation plays an important role in multi-domain SMT. Conventional approaches usually resort to statistical classifiers, but they require annotated monolingual data in different domains, which may not be available in some cases. We instead propose a simple but effective bagging-based approach without using any annotated data. Large-scale experiments show that our new method improves tr...

متن کامل

Image alignment via kernelized feature learning

Machine learning is an application of artificial intelligence that is able to automatically learn and improve from experience without being explicitly programmed. The primary assumption for most of the machine learning algorithms is that the training set (source domain) and the test set (target domain) follow from the same probability distribution. However, in most of the real-world application...

متن کامل

Overcoming Dataset Bias: An Unsupervised Domain Adaptation Approach

Recent studies have shown that recognition datasets are biased. Paying no heed to those biases, learning algorithms often result in classifiers with poor crossdataset generalization. We are developing domain adaptation techniques to overcome those biases and yield classifiers with significantly improved performance when generalized to new testing datasets. Our work enables us to continue to har...

متن کامل

Importance weighting and unsupervised domain adaptation of POS taggers: a negative result

Importance weighting is a generalization of various statistical bias correction techniques. While our labeled data in NLP is heavily biased, importance weighting has seen only few applications in NLP, most of them relying on a small amount of labeled target data. The publication bias toward reporting positive results makes it hard to say whether researchers have tried. This paper presents a neg...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

CoRR

دوره abs/1706.05335 شماره

صفحات -

تاریخ انتشار 2017

Biased Bagging for Unsupervised Domain Adaptation

نویسندگان

چکیده

منابع مشابه

Deep Unsupervised Domain Adaptation for Image Classification via Low Rank Representation Learning

Bagging-based System Combination for Domain Adaptation

Image alignment via kernelized feature learning

Overcoming Dataset Bias: An Unsupervised Domain Adaptation Approach

Importance weighting and unsupervised domain adaptation of POS taggers: a negative result

عنوان ژورنال:

اشتراک گذاری